Global efficiency of local immunization on complex networks 



Laurent Hébert-Dufresne,1 Antoine Allard,1 Jean-Gabriel Young,1 and Louis J. Dubé1

1 Département de Physique, de Génie Physique, et d'Optique,
Université Laval, Québec (Québec), Canada G1V 0A6
(Dated: August 30, 2012) 

Epidemics occur in all shapes and forms: infections propagating in our sparse sexual networks, information 
spreading through our much denser social interactions, or viruses circulating on the Internet. With the advent of 
large databases and efficient analysis algorithms, these processes can be better predicted and controlled. In this 
study, we use different characteristics of network organization to identify the influential spreaders in networks of 
diverse nature. We propose a local measure of node influence based on the network's community structure which 
is easily estimated in real systems and frequently outperforms the usual measure of a node's importance. More 
importantly, through an extensive study spanning 17 empirical networks and 2 epidemic models, we formulate a 
readily applicable approach which proves efficient even though different networks and different diseases require 
different strategies. This research is expected to guide efforts regarding public health policies, computer network 
security and the control of ecological systems. 



Epidemics seldom occur randomly. Instead, they follow the
structured pathways formed by the interactions and connections
of the host population [1, 2]. The spreading processes relevant
to our everyday life occur on networks of all kinds: social
(e.g. epidemics [3, 4]), technological (e.g. computer viruses
[5, 6]) or ecological (e.g. cascading extinctions in food webs
[7]). With the network representation, these completely
different processes can be modelled as the propagation of a
given agent on a set of nodes (the population) and links (the
interactions). Different systems imply networks with different
organizations, just as different agents require different
epidemic models.

There has long been significant interest in identifying the
influential spreaders in networks. Which nodes should be the
target of immunization efforts in order to optimally protect the
network against epidemics? Unfortunately, most studies feature
two significant shortcomings. Firstly, the proposed methods are
often based on optimization or heuristic algorithms requiring a
near-perfect and static knowledge of the network (e.g. [8, 9]),
which is seldom available. Secondly, methods are usually tested
on small numbers of real systems using a particular epidemic
scenario (e.g. [10, 11]), thus restricting the scope of possible
outcomes. However, numerical research based on empirical data
makes it possible to take into account the diversity of contact
networks found in nature, thus including structural features
that cannot be modelled analytically or that are simply unknown.

We here present such a numerical study, perhaps the largest
of its kind to date, where we argue that, depending on the
nature of the network and of the disease, different immunization
tactics have to be taken into consideration. In so doing, we
formalize the notion of node influence and illustrate how local
knowledge around a particular node is usually sufficient to
estimate its role in an epidemic. This finding is far from
trivial: it implies that an efficient immunization strategy can
be obtained from local measures alone, which are easily
estimated in practice and robust to noisy or incomplete
information. Yet, our main contribution is to illustrate how, in
certain cases, the influence of a node is not necessarily
dictated by its connectivity, but rather by its role in the
network's community structure (see Fig. 1).



RESULTS 

Models and measures 

FIG. 1. Protein interactions of S. cerevisiae (subset) [22]. The
three black nodes correspond to the ones with the highest degree,
and the three red ones have the highest membership number. In this
particular example, it is straightforward to conclude that the
latter are structurally more influential.

There exist two standard models emulating diverse types
of epidemics: the susceptible-infectious-recovered (SIR) and
susceptible-infectious-susceptible (SIS) dynamics. In both, an
infectious node has a given probability of eventually infecting
each of its susceptible neighbors during its infectious period,
which is terminated either by death/immunity leading to the
recovered state (SIR) or by a return to the susceptible state
(SIS). In the SIR dynamics, for a given transmission probability
T, the quantity of interest is the mean fraction R_f of recovered
nodes once a disease, not subject to a stochastic extinction, has
finished spreading (i.e. we focus on the giant component [12]).
As each edge can only be followed once, this dynamics allows us
to investigate how vulnerable a population is to the invasion of
a new pathogen. In the SIS dynamics, we are interested in the
prevalence I* (fraction of infectious nodes) of the disease at
equilibrium (equal amounts of infections and recoveries) as a
function of the ratio λ = α/β of infection rate α and recovery
rate β. This particular dynamics permits the study of how a given
network structure can sustain an already established epidemic.
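As an illustration of the SIR dynamics described above, the following minimal sketch runs outbreaks in which each edge leaving an infectious node is tried once and transmits with probability T, and then averages the final size over the outbreaks that reach at least 1% of the nodes (the cutoff quoted in the Methods). The adjacency-dictionary representation and the names `sir_outbreak` and `mean_final_size` are assumptions made for this example; this is not the code used in the study.

```python
import random
from collections import deque


def sir_outbreak(adj, seed, T, rng=random):
    """Return the set of nodes ever infected in one SIR outbreak.

    adj  : dict mapping each node to an iterable of its neighbours
    seed : the single initially infectious node
    T    : probability that an infectious node eventually infects
           a given susceptible neighbour
    """
    ever_infected = {seed}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            # Each link towards a still-susceptible node is tried once.
            if neighbour not in ever_infected and rng.random() < T:
                ever_infected.add(neighbour)
                queue.append(neighbour)
    return ever_infected


def mean_final_size(adj, T, runs=1000, min_fraction=0.01, rng=random):
    """Estimate R_f, keeping only outbreaks above min_fraction of the nodes."""
    nodes = list(adj)
    sizes = []
    for _ in range(runs):
        size = len(sir_outbreak(adj, rng.choice(nodes), T, rng)) / len(nodes)
        if size >= min_fraction:
            sizes.append(size)
    return sum(sizes) / len(sizes) if sizes else 0.0
```

With T = 1 every node reachable from the seed is infected, and with T = 0 the outbreak never leaves the seed, which gives two quick sanity checks for such a simulation.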

Should a fraction ε of the population be fully immunized,
our objective is to identify the nodes whose removal most
reduces R_f and I*. The epidemic influence of a node, that is,
the effect of its removal on R_f and I*, depends mainly on its
role in the organization of the network. Hence, to efficiently
immunize a population, we must first understand its underlying
structure.

Network organization can be characterized on different
scales, each of which affects the dynamics of propagation. At
the microscopic level, the most significant feature is the
degree of a node (its number of links, denoted k), which in turn
defines the degree distribution of the network. The significance
of the high-degree nodes (the hubs) for network structure in
general [13], for network robustness to random failure [14] and
for epidemic control [15] has long been recognized.

At the macroscopic level, the role of a node can be described
by its centrality, which may be defined in various ways.
Frequently used in the social sciences is the betweenness
centrality (b), quantifying the contributions of a given node to
the shortest paths between each pair of nodes in the network
[16]. Arguably, this method should be among the best estimates
of a node's epidemic influence as it directly measures its role
in the different pathways between all other individuals [17],
yet at a considerable computational cost. A simpler method, the
k-core (or k-shell) decomposition [18, 19], assigns nodes to
different layers (or coreness c), effectively defining the core
and periphery of a network (high and low c respectively). It has
recently been shown that coreness is well suited to identify
nodes that are the most at risk of being infected during the
course of an epidemic [20]. In light of our results, we will be
able to discuss the distinction between a node's vulnerability
to infection and its influence on the outcome of an epidemic.

The mesoscopic scale has recently been the subject of
considerable attention. At this level of organization, the focus
is on the redundancy of connections forming dense clusters,
referred to as the modularity or community structure of the
network [21, 22]. Nodes can here be distinguished by their
membership number m, i.e., the number of communities to which
they belong. We call structural hubs the nodes connecting the
largest number of different communities. These nodes act as
bridges facilitating the propagation of the disease from one
dense cluster to another. Targeting structural hubs to hinder
propagation in structured populations has been previously
proposed and investigated [10, 11], but has yet to be tested
extensively.
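The membership number can be read directly off a link partition. The sketch below assumes that every link has already been assigned a community label (for instance by a link-community algorithm); the input format and the name `membership_numbers` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def membership_numbers(link_community):
    """Map each node to its membership number m.

    link_community : dict mapping a link (u, v) to a community label
    """
    communities_of = defaultdict(set)
    for (u, v), label in link_community.items():
        # A node belongs to every community that one of its links belongs to.
        communities_of[u].add(label)
        communities_of[v].add(label)
    return {node: len(labels) for node, labels in communities_of.items()}
```

For two triangles sharing a single node, the shared node obtains m = 2 while every other node obtains m = 1, which is the bridging role attributed to structural hubs.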

Note that the microscopic and mesoscopic levels are char- 
acterized by local measures in the sense that they do not re- 
quire a complete knowledge of the network, in contrast to 
global measures like the betweenness centrality. However, in 
numerical simulations, the network is static, fully known and 
there are no time constraints, allowing us to compare global 
measures with local measures, which are probably best suited 
in practice, in a variety of epidemic scenarios. 

We therefore ask without discrimination: which of the degree,
the coreness, the betweenness centrality or the membership
number is the best identifier of the most influential nodes on
the outcome of an epidemic? To answer this question, we have
simulated SIR and SIS dynamics on 17 real-world networks in
which a fraction ε of the nodes was removed following the
descending order of the nodes' score for each of the four
different measures. By comparing their efficiency in reducing
R_f or I* as a function of ε, we are able to establish which
measure is best suited for a given scenario characterized by a
network structure, a propagation dynamics and a disease
virulence (i.e. probability of transmission).
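The removal step of this protocol can be sketched as follows. The example assumes that scores are computed once on the intact network and not updated after each removal (the text does not specify this), and the name `immunize` is hypothetical.

```python
def immunize(adj, score, eps):
    """Remove the top fraction eps of nodes, ranked by descending score.

    adj   : dict mapping each node to a list of its neighbours
    score : dict mapping each node to its score (k, c, b or m)
    eps   : fraction of the population to immunize
    Returns a new adjacency dict without the immunized nodes.
    """
    n_remove = int(eps * len(adj))
    ranked = sorted(adj, key=lambda node: score[node], reverse=True)
    removed = set(ranked[:n_remove])
    return {
        u: [v for v in neighbours if v not in removed]
        for u, neighbours in adj.items()
        if u not in removed
    }
```

The returned network can then be passed to an SIR or SIS simulation, and the experiment repeated for each measure and each value of ε.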

Case study: a data exchange network 

We first illustrate our methods using the network of users 
of the Pretty-Good-Privacy algorithm for secure information 
interchange (hereafter, the PGP network) [23], which could
be the host of the propagation of computer viruses, rumors or 
viral marketing campaigns. Results for the 16 other networks 
are presented and discussed in the next section as well as in 
the Supporting Information Appendix (SI). 

Communities in the network are extracted with the link
community algorithm of Ahn et al. [21]. This algorithm groups
links, and therefore the nodes they join, into communities based
on the overlap of their respective neighboring nodes. This
overlap is quantified through a Jaccard coefficient, and two
links are grouped into the same community when their coefficient
exceeds a certain threshold. This threshold value acts as a
resolution, making it possible to examine different levels of
organization. As suggested in Ref. [21], the value of the
threshold is chosen to maximize the average density ρ of the
communities (see Methods). As this choice may seem arbitrary,
Fig. 2 investigates the similarity between the nodes with the
highest membership numbers for different thresholds. This
suggests that the membership number is fairly robust around the
threshold values yielding communities with high densities, since
denser communities require a significant change in resolution to
break apart. Moreover, Fig. 2 also demonstrates that the effect
of the removal of the structural hubs on an SIS epidemic is very
robust to the choice of the threshold. Thus, we will henceforth
use the membership numbers obtained with the threshold value
corresponding to the highest community density.

FIG. 2. Community structure of the PGP network. (top) Community
density (ρ) obtained through different Jaccard thresholds.
(middle) Robustness of the structural hubs identification
methods. Element (i, j) gives the overlap (normalized) between
the structural hubs (top 1%) selected with thresholds i and j.
The top row (and last column) of the matrix corresponds to the
case where the membership number equals the degree. (bottom)
Prevalence I* of SIS epidemics with λ = 5 when the top 1% of
structural hubs are removed (compared with the results without
removal in blue or with random targets in red).

Any differences between the efficiency of the different
methods are due to the immunized nodes not being the same.
Figure 3 investigates the correlations between the different
properties (k, b, c and m) of each node. Perhaps the most
important result here is that nodes with a high membership
number may have relatively small degree, coreness and
betweenness centrality. Hence, we expect the immunization method
based on community structure to have a different influence on
the outcome of epidemics. These correlations are further
investigated in the SI.

FIG. 3. Node cloud of the PGP network. We investigate
correlations between the degree (k, right axis), the coreness
(c, left axis), the betweenness centrality (b, vertical axis)
and the membership number (m, color) for each node. Each measure
is normalized according to the highest value found in the
network. Each node is represented in this 4-dimensional space
and a simple triangulation procedure then yields a more
intelligible structure. Note that some structural hubs (dark
red) can be found even at relatively small degree (~ k_max/2),
coreness (~ c_max/5) and centrality (~ b_max/3).

To investigate various epidemic scenarios, we consider both
SIS and SIR dynamics (which may behave quite differently) with
various values of transmission probability (λ and T for SIS and
SIR respectively). In fact, each network features an epidemic
threshold, i.e. critical values λ_c [24] and T_c [25], below
which I* and R_f vanish in an equivalent infinite network
ensemble. As we will show, the observed behavior can differ
significantly depending on whether λ and T are close to their
critical value.
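For orientation, both thresholds can be estimated from the first two moments of the degree distribution. The sketch below uses the standard result for uncorrelated random networks, T_c = <k> / (<k^2> - <k>) (see Ref. [25]), and the heterogeneous mean-field estimate λ_c = <k> / <k^2> (see Ref. [5]). These textbook formulas are assumptions of this example: they ignore clustering and degree correlations, and they are not necessarily the formalism of Ref. [24]. The function names are hypothetical.

```python
def degree_moments(adj):
    """Return (<k>, <k^2>) for a network given as an adjacency dict."""
    degrees = [len(neighbours) for neighbours in adj.values()]
    n = len(degrees)
    k1 = sum(degrees) / n
    k2 = sum(k * k for k in degrees) / n
    return k1, k2


def sir_threshold(adj):
    """Estimate T_c = <k> / (<k^2> - <k>); undefined if all degrees are 1."""
    k1, k2 = degree_moments(adj)
    return k1 / (k2 - k1)


def sis_threshold(adj):
    """Estimate lambda_c = <k> / <k^2> (heterogeneous mean-field)."""
    k1, k2 = degree_moments(adj)
    return k1 / k2
```

For a 3-regular network, these estimates give T_c = 1/2 and λ_c = 1/3. Broad degree distributions increase <k^2> and therefore lower both thresholds.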

Figure 4 presents results of different immunization methods
against SIS dynamics for different values of λ. On the top
figure, where λ is near λ_c, the most successful method of
intervention is to target nodes according to their degree. At
low virulence, the disease follows only a very small fraction
of all links. The shortest paths are seldom used, which explains
the poor performance of betweenness centrality. Moreover, the
disease is not affected by the communities because, even in
dense neighborhoods, most links are not travelled. We then say
that the disease, unaffected by link clustering, follows a
tree-like structure (without loops), where community memberships
are insignificant. It is therefore better to simply remove as
many links as possible.

As λ departs from λ_c, we see that immunization based on
membership numbers quickly outperforms the other methods. As
more links are travelled, the disease is more likely to follow
superfluous links in already infected communities. Hubs sharing
their many links within few communities are therefore not as
efficient in causing secondary infections as one might expect.
Similarly, targeting through betweenness centrality also
performs better with higher λ, albeit not as well as
membership-targeting in this case. For λ ≫ λ_c, we see that
immunization based on membership numbers (local) and on
betweenness centrality (global) converge toward similar
efficiency, significantly outperforming degree-based
immunization.

FIG. 4. Efficiency of the immunization methods against an SIS
epidemic on the PGP network. Nodes are removed in decreasing
order of their score according to each method: coreness (green
pentagons), degree (black circles), betweenness centrality (blue
triangles) and memberships (red diamonds), and the effect of
removal is then quantified in terms of the decrease of the
prevalence I*. The prevalence of the epidemic when the removed
nodes are chosen at random (grey squares) has been added for
comparison. Figures are presented in decreasing order of
virulence (λ) from top to bottom.



Another interesting feature of our results is the poor
performance of immunization based on node coreness. A previous
study had clearly shown that epidemics mostly flourish within
the core of the network (see Fig. 5) because of its density
[20]. Ironically, this density also implies redundancy. While
the core nodes are highly at risk of being infected, their
removal has a limited effect because there exist alternate paths
within their neighborhood: the core offers a perfect environment
to the disease and is consequently robust to node removal. It is
therefore more effective to stop the disease from reaching, or
leaving, the core by removing the nodes bridging other
neighborhoods (i.e. the structural hubs).

Similar conclusions are drawn for the SIR dynamics. As T
moves away from T_c, the most significant level of organization
shifts from degree-centric (microscopic) to community-centric
(mesoscopic) as membership-based immunization outperforms the
other strategies.




FIG. 5. k-core decomposition of the PGP network. Representation
(based on [26]) of the k-shells in the PGP network with nodes
colored according to their total infectious period during a
given time interval. Red nodes are more likely to be infectious
at any given time than green nodes, as the color is given by the
square of the fraction of time spent in the infectious state.
Note how the central nodes (the core) of the network are most at
risk.

Results on networks of diverse nature 

In this section, we highlight different behaviors observed
in social, technological and communication networks using 7
other datasets (full results for the 17 datasets are available
in the SI): subset of the World Wide Web (WWW) [?], MathSciNet
co-authorship network (MathSci) [27], Western States Power Grid
of the United States (Power Grid) [28], Internet Movie Database
since 2000 (IMDb) [29], cond-mat arXiv co-authorship network
(arXiv) [22], e-mail interchanges between members of the
University Rovira i Virgili (Email) [30] and Gnutella
peer-to-peer network (Gnutella) [31].

The results for the WWW, MathSci and IMDb networks further
support our previous conclusions, with the exception that
membership-based immunization performs surprisingly better than
the degree-based variant even near the epidemic threshold of the
network (see WWW and MathSci). The betweenness-centrality-based
immunization was not tested on IMDb because of computational
constraints (its computation required over 800 hours with our
available resources and a standard algorithm [32]), which
illustrates a significant limit of this measure. Approximations
could have been used [33], but the intricate (and mostly
unknown) relationship between the efficiency of the measure and
the accuracy of the approximation would have only caused
additional problems.

The results presented for the Power Grid network illustrate
a fundamental difference between the SIS and the SIR dynamics:
while we are interested in the fraction of the network
sustaining an established epidemic in the former, it is the
fraction of nodes invaded by a new disease that is relevant in
the latter. In fact, the structure of the Power Grid, a chain of
small, easily disconnected modules, enhances the qualitative
discrepancy between the epidemic influence of nodes subjected to
these two dynamics. For the SIS dynamics, the membership-based
intervention is the most efficient because it weakens all
modules, limiting the prevalence of the disease. In contrast,
targeting through betweenness centrality merely separates the
modules, so that, taken individually, they remain infectious.
For the SIR dynamics, separating the modules is the best
approach as it directly stops the infection from spreading,
whereas weakened but connected modules still provide pathways.
This effect is a direct consequence of the particular structure
of the Power Grid and is insignificant on other networks.

FIG. 6. Efficiency of the immunization methods against SIS and
SIR epidemics on several networks. Nodes are removed in
decreasing order of their score according to each method:
coreness (green pentagons), degree (black circles), betweenness
centrality (blue triangles) and memberships (red diamonds), to
measure efficiency by the decrease of I* or R_f. The size of the
epidemics for random removal of nodes (gray squares) is added
for comparison. Error bars have been omitted for clarity of the
SIR results on the Power Grid, but are shown in the SI.

Finally, the last set of results, on arXiv, Email and
Gnutella, presents the effect of the community density ρ on the
performance of membership-based immunization. For very small ρ,
the paths within communities do not qualitatively differ from
the links bridging neighborhoods in their effect on the disease
propagation. This targeting method is therefore expected to
converge toward degree-based immunization if m and k are
strongly correlated. However, as most tested networks had
significant clustering, ρ > 30%, the importance of memberships
should not be understated.
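The community density ρ used above is defined in the Methods as a link-weighted average of the density of redundant links within each community. Under that definition, and assuming each community is summarized by its number of nodes n_i and number of links d_i, it could be computed as in the following sketch (the function names and the input format are hypothetical).

```python
def community_density(n_i, d_i):
    """Density of redundant links in a community of n_i > 2 nodes, d_i links."""
    max_links = n_i * (n_i - 1) / 2   # links of a complete subgraph
    tree_links = n_i - 1              # minimum needed to stay connected
    return (d_i - tree_links) / (max_links - tree_links)


def average_density(communities):
    """Link-weighted average density over all communities.

    communities : iterable of (n_i, d_i) pairs
    Single-edge communities are ignored, as in the Methods; at least one
    larger community is required.
    """
    kept = [(n, d) for n, d in communities if d > 1]
    total_links = sum(d for _, d in kept)   # D in the Methods
    return sum(d * community_density(n, d) for n, d in kept) / total_links
```

A triangle has density 1 and a three-node path has density 0, so a partition made of one of each (plus any number of single-edge communities) yields ρ = 3/5.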



DISCUSSIONS

One of the interesting contributions of this work is to offer a
formal definition of the epidemic influence of nodes, which is
open to diverse methods of approximation. Our results show that
standard measures such as the degree are not always the best
indicators of a node's influence. Moreover, we have highlighted
that the coreness, which has recently been proposed as an
indicator of nodes' influence [20], performs poorly. This has
brought us to distinguish between individual risk and global
influence. We have also illustrated how a universal approach is
still wanting, since different networks and different diseases
require different methods of intervention.

Consequently, the fact that the number of communities to which
a node belongs is often an excellent measure of its epidemic
influence (one that is at times better, at times equivalent, but
never much worse than connectivity and global centrality
measures) is a particularly important result. Additionally, the
fact that it is a local measure is especially relevant
considering that we rarely have access to the exact network
structure of a system, either because it is simply too large
(WWW), too dynamic (email networks) or because the links
themselves are ill-defined (social networks). Not only are local
measures much more robust to noisy and incomplete information,
but memberships can also be easier to estimate than a node's
actual degree. For instance, consider how much easier it is to
enumerate your social groups (work, family, etc.) than the
totality of your acquaintances.

Finally, that two local measures, degree and memberships, are
sufficient to efficiently immunize networks may well be the
single most important conclusion of this work. We thus offer a
simple procedure to judge which of these local measures can be
expected to yield the best results in a given situation.

1. From the degree distribution, estimate the virulence of
the disease in relation to the epidemic threshold λ_c [24]
or T_c [25].

2. If virulent (λ ≫ λ_c or T ≫ T_c), evaluate the network's
community structure; otherwise, go to 4.

3. If the community density is high (ρ > 30%), immunize
nodes according to their memberships; otherwise, go to 4.

4. For a virulence near the epidemic threshold, or for
sparse communities (low ρ), immunize according to the
degree of nodes.

This work is expected to guide immunization efforts toward
simpler, more precise and efficient strategies. Likewise, the
introduction of a node influence classification scheme opens
the door for more research in the hope of finding ever better
local estimates of a node's role in the global state of its
system.



METHODS
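Selection procedure The four-step procedure given in the Discussions section can be condensed into a single test. The sketch below is only illustrative: the text does not quantify how far above threshold a disease must be to count as virulent, so the parameter `virulence_factor` is a hypothetical stand-in for the condition λ ≫ λ_c (or T ≫ T_c), while `density_cutoff` reflects the 30% community density quoted in the text.

```python
def choose_measure(virulence, threshold, density,
                   virulence_factor=2.0, density_cutoff=0.30):
    """Return "membership" or "degree" for a given epidemic scenario.

    virulence : lambda (SIS) or T (SIR) of the disease
    threshold : corresponding critical value, lambda_c or T_c
    density   : community density rho of the network
    """
    # Steps 1-2: is the disease well above its epidemic threshold?
    virulent = virulence > virulence_factor * threshold
    # Step 3: are the communities dense enough to matter?
    if virulent and density > density_cutoff:
        return "membership"
    # Step 4: near threshold, or sparse communities.
    return "degree"
```

For example, `choose_measure(5.0, 0.1, 0.5)` selects membership-based immunization, whereas a disease barely above threshold or a network with sparse communities falls back on degree-based immunization.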



Centrality For all pairs (a, b) of nodes excluding i, list the
paths between a and b which are of length ℓ_ab, such that there
exists no shorter path between a and b. Let n_ab(i) be the number
of such paths which contain i, and n_ab their total number. The
betweenness centrality b_i of node i is then given by:

$$ b_i = \sum_{(a,b)} \frac{n_{ab}(i)}{n_{ab}} \tag{1} $$
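For unweighted networks, this quantity can be obtained with one breadth-first search per node, following the accumulation scheme of Ref. [32]. The sketch below is a plain Python rendition written for this text, assuming an undirected network stored as an adjacency dictionary; it is not the implementation used in the study.

```python
from collections import deque


def betweenness(adj):
    """Betweenness b_i of every node of an unweighted, undirected network."""
    b = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies, starting from the farthest nodes.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                b[w] += delta[w]
    # Every unordered pair (a, b) has been counted from both endpoints.
    return {v: value / 2.0 for v, value in b.items()}
```

Each search costs a time proportional to the number of links, so the full computation scales as the product of the numbers of nodes and links, which is consistent with the computational burden reported for the largest networks.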



Coreness The coreness c_i of node i is the highest integer c such
that the node is part of the set of all nodes with at least c
links within the set.
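This definition translates into a simple peeling procedure: repeatedly remove the node of smallest remaining degree, and record the largest such minimum encountered so far. The sketch below is a deliberately simple quadratic-time version written for illustration (Ref. [19] describes a linear-time algorithm); the name `coreness` and the adjacency-dictionary input are assumptions of this example.

```python
def coreness(adj):
    """Return the coreness c_i of every node of an undirected network."""
    degree = {v: len(neighbours) for v, neighbours in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # The smallest remaining degree sets the current shell; shells
        # never decrease as the peeling proceeds.
        v = min(remaining, key=degree.get)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return core
```

For a triangle with one pendant node attached, the pendant node has coreness 1 and the three triangle nodes have coreness 2.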

Community detection Two links, e_ij and e_ik, stemming from a
given node i are said to belong to the same community if their
Jaccard coefficient J(e_ij, e_ik) (similarity measure) is above a
given threshold J_c:

$$ J(e_{ij}, e_{ik}) = \frac{|A \cap B|}{|A \cup B|} \tag{2} $$

where A (B) is the set containing the neighbors of j (k) and
includes j (k) itself but excludes i.
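Equation (2) can be evaluated directly from the neighbor sets. The following sketch follows the definition of A and B given above; the function name `link_jaccard` and the adjacency-dictionary input are assumptions made for illustration.

```python
def link_jaccard(adj, i, j, k):
    """Jaccard coefficient of the links e_ij and e_ik sharing node i."""
    # Neighbours of j (k), including j (k) itself but excluding i.
    A = (set(adj[j]) | {j}) - {i}
    B = (set(adj[k]) | {k}) - {i}
    return len(A & B) / len(A | B)
```

Two links of a triangle are maximally similar (J = 1), whereas the two links of an open wedge, whose end nodes share no neighbor other than i, have J = 0 and are separated for any positive threshold.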

Community density The density ρ_i of a community i of n_i > 2
nodes and d_i links is the proportion of the possible redundant
links (i.e., all links excluding the minimal n_i - 1 links that
are needed for this community to be connected) that do exist:

$$ \rho_i = \frac{d_i - (n_i - 1)}{n_i (n_i - 1)/2 - (n_i - 1)} \tag{3} $$

The community density ρ is then calculated according to

$$ \rho = \frac{1}{D} \sum_i d_i \rho_i , \tag{4} $$

where D is the total number of edges not belonging to single-edge
communities (to avoid penalizing or favoring non-modular
partitions).

Numerical simulations To investigate the fraction of a network
which can sustain an epidemic, SIS simulations start with all
nodes in an infectious state and are then relaxed until an
equilibrium is reached. To investigate the mean fraction of a
network which a disease can invade, SIR simulations start with a
single infectious node and run until there are no more
infectious nodes. Results shown on the figures are obtained by
averaging over the outcome of several numerical simulations. For
the SIR dynamics, only the simulations leading to a large-scale
epidemic (at least 1% of the nodes) were considered.
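The SIS protocol described above (all nodes initially infectious, followed by a relaxation toward equilibrium) can be sketched with a discrete-time approximation. The time step `dt`, the relaxation length `burn_in` and the number of `samples` are assumptions of this example, since the text does not specify the integration scheme; the approximation requires α·dt and β·dt to be at most 1.

```python
import random


def sis_prevalence(adj, alpha, beta, dt=0.05, burn_in=2000, samples=2000,
                   rng=random):
    """Estimate the equilibrium prevalence I* of an SIS epidemic.

    adj   : dict mapping each node to an iterable of its neighbours
    alpha : infection rate per infectious-susceptible link
    beta  : recovery rate of infectious nodes
    """
    infectious = set(adj)              # all nodes start infectious
    total = 0.0
    for step in range(burn_in + samples):
        new_infections = set()
        recoveries = set()
        for node in infectious:
            for neighbour in adj[node]:
                if neighbour not in infectious and rng.random() < alpha * dt:
                    new_infections.add(neighbour)
            if rng.random() < beta * dt:
                recoveries.add(node)
        infectious = (infectious - recoveries) | new_infections
        if step >= burn_in:
            # Average the prevalence once the relaxation period is over.
            total += len(infectious) / len(adj)
    return total / samples
```

On a finite network the disease-free state is absorbing, so long runs slightly above threshold may end with no infectious node at all; averaging over several independent runs, as done for the figures, mitigates this effect.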



[1] Caldarelli, G. & Vespignani, A. (2007) Large Scale Structure
and Dynamics of Complex Networks. World Scientific Publishing
Company, Singapore.

[2] Keeling, M.J. & Rohani, P. (2008) Modeling Infectious
Diseases in Humans and Animals. Princeton University Press,
Princeton.

[3] Anderson, R.M. & May, R.M. (1991) Infectious diseases of
humans: dynamics and control. Oxford University Press, New York.

[4] Keeling, M.J. & Eames, K.T.D. (2005) Networks and epidemic
models. J R Soc Interface 2:295-307

[5] Pastor-Satorras, R. & Vespignani, A. (2001) Epidemic
Spreading in Scale-Free Networks. Phys. Rev. Lett. 86:3200-3203

[6] Gómez-Gardeñes, J., Echenique, P. & Moreno, Y. (2006)
Immunization of real complex communication networks. Eur. Phys.
J. B 49:259-264

[7] Dunne, J.A. & Williams, R.J. (2009) Cascading extinctions and
community collapse in model food webs. Philos Trans R Soc Lond B
Biol Sci 364:1711-1723

[8] Gallos, L.K., Liljeros, F., Argyrakis, P., Bunde, A. &
Havlin, S. (2007) Improving immunization strategies. Phys. Rev.
E 75:045104(R)

[9] Chen, Y., Paul, G., Havlin, S., Liljeros, F. & Stanley, H.E.
(2008) Finding a Better Immunization Strategy. Phys. Rev. Lett.
101:058701

[10] Salathé, M. & Jones, J.H. (2010) Dynamics and Control of
Diseases in Networks with Community Structure. PLoS Comp. Biol.
6:e1000736

[11] Masuda, N. (2009) Immunization of networks with community
structure. New J. Phys. 11:123018

[12] Newman, M.E.J., Strogatz, S.H. & Watts, D.J. (2001) Random
graphs with arbitrary degree distributions and their
applications. Phys. Rev. E 64:026118

[13] Barabási, A.-L. & Albert, R. (1999) Emergence of scaling in
random networks. Science 286:509-512

[14] Albert, R., Jeong, H. & Barabási, A.-L. (2000) Error and
attack tolerance of complex networks. Nature 406:378-382

[15] Pastor-Satorras, R. & Vespignani, A. (2002) Immunization of
complex networks. Phys. Rev. E 65:036104

[16] Freeman, L. (1979) Centrality in social networks: Conceptual
clarification. Social Networks 1:215-239



[17] Barthélemy, M. (2004) Betweenness centrality in large
complex networks. Eur. Phys. J. B 38:163-168

[18] Batagelj, V. & Zaveršnik, M. Generalized Cores.
arXiv:cs/0202039v1

[19] Batagelj, V. & Zaveršnik, M. An O(m) Algorithm for Cores
Decomposition of Networks. arXiv:cs/0310049v1

[20] Kitsak, M. et al. (2010) Identification of influential
spreaders in complex networks. Nature Physics 6:888-893

[21] Ahn, Y.-Y., Bagrow, J.P. & Lehmann, S. (2010) Link
communities reveal multiscale complexity in networks. Nature
466:761-764

[22] Palla, G., Derényi, I., Farkas, I. & Vicsek, T. (2005)
Uncovering the overlapping community structure of complex
networks in nature and society. Nature 435:814-818

[23] Boguñá, M., Pastor-Satorras, R., Díaz-Guilera, A. & Arenas,
A. (2004) Models of social networks based on social distance
attachment. Phys. Rev. E 70:056122

[24] Hébert-Dufresne, L., Noël, P.-A., Allard, A., Marceau, V. &
Dubé, L.J. (2010) Propagation dynamics on networks featuring
complex topologies. Phys. Rev. E 82:036115

[25] Newman, M.E.J. (2002) Spread of epidemic disease on
networks. Phys. Rev. E 66:016128

[26] Alvarez-Hamelin, I., Dall'Asta, L., Barrat, A. &
Vespignani, A. (2006) k-core decomposition: A tool for the
visualization of large scale networks. Advances in Neural
Information Processing Systems 18:41-50

[27] Palla, G., Farkas, I.J., Pollner, P., Derényi, I. & Vicsek,
T. (2008) Fundamental statistical features and self-similar
properties of tagged networks. New J. Phys. 10:123026

[28] Watts, D.J. & Strogatz, S.H. (1998) Collective dynamics of
small-world networks. Nature 393:440-442

[29] Hébert-Dufresne, L., Allard, A., Marceau, V., Noël, P.-A. &
Dubé, L.J. (2011) Structural Preferential Attachment: Network
Organization beyond the Link. Phys. Rev. Lett. 107:158702

[30] Guimerà, R., Danon, L., Díaz-Guilera, A., Giralt, F. &
Arenas, A. (2003) Self-similar community structure in a network
of human interactions. Phys. Rev. E 68:065103(R)

[31] Ripeanu, M., Foster, I. & Iamnitchi, A. (2002) Mapping the
Gnutella Network: Properties of Large-Scale Peer-to-Peer Systems
and Implications for System Design. IEEE Internet Computing
Journal 6:50-57

[32] Brandes, U. (2001) A Faster Algorithm for Betweenness
Centrality. J. Math. Sociol. 25(2):163-177

[33] Madduri, K., Ediger, D., Jiang, K., Bader, D.A. &
Chavarría-Miranda, D.G. (2009) A Faster Parallel Algorithm and
Efficient Multithreaded Implementations for Evaluating
Betweenness Centrality on Massive Datasets. Third Workshop MTAAP



ACKNOWLEDGEMENTS 

The authors wish to thank Louis Roy for the development of a
k-core visualization tool; Yong-Yeol Ahn et al. for their link
community algorithm; all the authors of the cited papers for
providing their network data; and Calcul Québec for computing
facilities. This research was funded by CIHR, NSERC and FRQ-NT.



